The Approach of Speaker Diarization by Gaussian Mixture Model (GMM)

Authors

  • K. Rajendra Prasad
  • D. Jareena Begum
  • E. Lingappa
Abstract

Speaker identification is an important step in speaker diarization. For identification, each speaker is modeled by a Gaussian mixture model (GMM). A large GMM, called a Universal Background Model (UBM), is adapted to obtain each speaker model. This paper focuses on speech clustering for speaker diarization. Speaker diarization consists of two steps: speech segmentation and speech clustering. In speech segmentation, features are extracted from each speech segment in the form of Mel-frequency cepstral coefficients (MFCCs). Each speech segment is then modeled by UBM adaptation, and related speech segments are grouped into speech clusters. This paper describes the speech segmentation, UBM adaptation, and speech clustering techniques.
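The UBM adaptation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which adaptation scheme the authors use, so the code below assumes the widely used mean-only MAP adaptation with a relevance factor, applied to a diagonal-covariance UBM. The function names, the relevance value of 16, and the input `X` (taken to be the MFCC frames of one speech segment) are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def log_gauss_diag(X, means, variances):
    """Log density of every frame under every diagonal-covariance component.

    X: (T, D) frames, means: (K, D), variances: (K, D). Returns (T, K).
    """
    diff = X[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    mahal = np.sum(diff ** 2 / variances[None, :, :], axis=2)
    return -0.5 * (log_norm + mahal)


def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a UBM to one segment (assumed scheme).

    relevance is a hypothetical default; larger values keep the adapted
    means closer to the UBM means.
    """
    # Posterior probability of each UBM component for each frame.
    log_p = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)    # (T, K)

    # Zeroth- and first-order sufficient statistics.
    n = post.sum(axis=0)                                  # (K,)
    ex = (post.T @ X) / np.maximum(n, 1e-10)[:, None]     # (K, D)

    # Data-dependent interpolation between segment statistics and the UBM.
    alpha = (n / (n + relevance))[:, None]
    return alpha * ex + (1.0 - alpha) * means
```

Components that receive many frames move toward the segment's data, while components that receive almost none stay at their UBM values. The adapted means of two segments could then be compared to decide whether the segments belong in the same cluster.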


Similar resources

Speaker Diarization System Based on GMM and BIC

This paper presents an approach for speaker diarization based on a novel combination of Gaussian mixture model (GMM) and standard Bayesian information criterion (BIC). The Gaussian mixture model provides a good description of the feature vector distribution, and BIC enables a proper merging and stopping criterion. Our system combines the advantages of these two methods and yields favorable performance. Ex...


On the use of GSV-SVM for Speaker Diarization and Tracking

In this paper, we present the use of Gaussian Supervectors with Support Vector Machine classifiers (GSV-SVM) in an acoustic speaker diarization and a speaker tracking system, compared with a standard Gaussian Mixture Model system based on adapted Universal Background Models (GMM-UBM). GSV-SVM systems (which share the adaptation step with the GMM-UBM systems) are observed to have comparable perfo...


Improving Speaker Diarization

This paper describes the LIMSI speaker diarization system used in the RT-04F evaluation. The RT-04F system builds upon the LIMSI baseline data partitioner, which is used in the broadcast news transcription system. This partitioner provides a high cluster purity but has a tendency to split the data from a speaker into several clusters when there is a large quantity of data for the speaker. In th...


KL realignment for speaker diarization with multiple feature streams

This paper aims at investigating the use of Kullback-Leibler (KL) divergence based realignment with application to speaker diarization. The use of KL divergence based realignment operates directly on the speaker posterior distribution estimates and is compared with traditional realignment performed using HMM/GMM system. We hypothesize that using posterior estimates to re-align speaker boundarie...


Fast Speaker Diarization Using a Specialization Framework for Gaussian Mixture Model Training

Most current speaker diarization systems use agglomerative clustering of Gaussian Mixture Models (GMMs) to determine “who spoke when” in an audio recording. While state-of-the-art in accuracy, this method is computationally costly, mostly due to the GMM training, and thus limits the performance of current approaches to be roughly real-time. Increased sizes of current datasets require processing...




Publication date: 2014